Collaborating Authors

 datawork summit


Dataworks Summit - Big Data meets multi-cloud

#artificialintelligence

'The network is the computer' was the mantra of the early days of connected systems, but it took the Internet to fully realize the concept. In today's era of smart sensors, cheap storage and sophisticated algorithms, an apt aphorism might be 'the data is the business', in that business decisions, new services and product strategies are fueled by the analysis of massive amounts of mundane data. The ability to collect, store and analyze such routine data as transaction records, system logs, sensor readings and location information with increasing granularity has the potential to turn formerly lost or ignored information into valuable business assets. The organizations most adept at spinning this digital straw into gold gain a significant competitive advantage. Aside from advances in core infrastructure, perhaps nothing has been as responsible for the rise of data-inspired business decisions as the Hadoop ecosystem of open source distributed data storage and processing software.


Quick! Quick! Exploration!: A framework for searching a predictive model on Apache Spark - DataWorks Summit

#artificialintelligence

Research and development of machine learning (ML) algorithms is a hot topic in data analytics. New open source (OSS) ML libraries, such as Google's TensorFlow and XGBoost from the University of Washington, are proposed continuously. As the choice of ML algorithms and libraries grows, model selection is becoming a serious pain point for data analytics in many business use cases. Despite the progress in ML technologies, achieving high accuracy still requires hyperparameter tuning over a large search space. Data scientists have to run ML algorithms hundreds to thousands of times, switching between OSS libraries and hyperparameter configurations, which can take several days. Data preprocessing is another big headache for data scientists, because model selection across many ML OSS libraries requires converting the data to each library's format and saving the converted data to storage.
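To make the cost described above concrete, here is a minimal sketch of an exhaustive search over libraries and hyperparameter configurations. It is not the framework presented in the talk. The candidate names, the grids, and `toy_evaluate` (a stand-in for an actual train-and-validate run) are all hypothetical and chosen only for illustration.

```python
from itertools import product


def expand_grid(grid):
    """Yield one dict per combination of hyperparameter values in `grid`."""
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def search(candidates, evaluate):
    """Evaluate every (model, config) pair; return the best one and the trial count."""
    best = None
    trials = 0
    for model_name, grid in candidates.items():
        for config in expand_grid(grid):
            score = evaluate(model_name, config)
            trials += 1
            if best is None or score > best[0]:
                best = (score, model_name, config)
    return best, trials


# Hypothetical search space: two model families, each with its own grid.
CANDIDATES = {
    "gradient_boosting": {
        "max_depth": [3, 6, 9],
        "learning_rate": [0.01, 0.1, 0.3],
    },
    "neural_net": {
        "hidden_units": [32, 64, 128],
        "learning_rate": [0.001, 0.01],
    },
}


def toy_evaluate(model_name, config):
    """Made-up score standing in for training a model and measuring validation accuracy."""
    if model_name == "gradient_boosting":
        return 1.0 - abs(config["max_depth"] - 6) / 10 - abs(config["learning_rate"] - 0.1)
    return 0.9 - abs(config["hidden_units"] - 64) / 1000 - abs(config["learning_rate"] - 0.01)


best, trials = search(CANDIDATES, toy_evaluate)
print(f"{trials} trials, best: {best[1]} with {best[2]}")
```

Even this tiny space needs 15 evaluations (3 x 3 plus 3 x 2), and each added hyperparameter multiplies the count. When every evaluation is a full training run, the total quickly reaches the hundreds or thousands of executions mentioned above.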


AI and Data Science Trends at DataWorks Summit - DataWorks Summit

#artificialintelligence

This year, I'm honored to be the chair of the Artificial Intelligence and Data Science track at the DataWorks Summit in San Jose. Reviewing the submissions and working with the experienced and sharp committee members has been an education in itself, in particular the chance to see what's trending in the open source world. My day-to-day data science work gives me the chance to dig into a few open source projects, but it's hard to find time to get an overview of which topics and projects are hot and worth exploring more deeply. The key topics emerging this year are deep learning, graph-based machine learning and model inference in production. Not surprisingly, the topics and tools around deep learning (DL) still top the list of big trends, and top-notch research in math and computation is driving progress across vision, speech and text.


AI and Data Science Presentations to Look Forward to at DataWorks Summit - DZone AI

#artificialintelligence

Not surprisingly, the topics and tools around deep learning (DL) still top the list of big trends, and top-notch research in math and computation is driving progress across vision, speech, and text. Many in the DataWorks audience are already developing cutting-edge deep learning systems, while others are just beginning to experiment with DL. Either way, I suggest attending Magnus Hyttsten's talk on getting started with TensorFlow. As you read this article, a new DL framework might already be in the works and about to be open-sourced. It's getting harder and harder to keep track of all the new DL frameworks and their capabilities.



Artificial Intelligence and Analytic Ops to Continuously Improve Business Outcomes - DataWorks Summit

#artificialintelligence

The time for enterprises to gain market advantage through Artificial Intelligence is now. Many AI-enabled advances are already transforming business processes and customer experiences, but the vast majority of AI-enhanced use cases are still to be discovered, developed, and deployed. To discover and capture the value available through deployed AI, new deep learning techniques are the focus of feverish research and development in academia and business. However, even successful AI experiments are often never deployed to business operations, resulting in wasted effort, time, and money, and leaving businesses dangerously exposed to competitors that have integrated AI into their ongoing operations. Experimentation is essential to realizing the promise of AI, but enterprises face a substantial risk that their experiments, even successful ones, will do nothing to improve their business outcomes.